Automatic benchmarking of large multimodal models via iterative experiment programming
Conti, Alessandro, Fini, Enrico, Rota, Paolo, Wang, Yiming, Mancini, Massimiliano, Ricci, Elisa
Assessing the capabilities of large multimodal models (LMMs) often requires the creation of ad-hoc evaluations. Currently, building new benchmarks requires tremendous amounts of manual work for each specific analysis. This makes the evaluation process tedious and costly. In this paper, we present APEx, Automatic Programming of Experiments, the first framework for automatic benchmarking of LMMs. Given a research question expressed in natural language, APEx leverages a large language model (LLM) and a library of pre-specified tools to generate a set of experiments for the model at hand, and progressively compile a scientific report. The report drives the testing procedure: based on the current status of the investigation, APEx chooses which experiments to perform and whether the results are sufficient to draw conclusions. Finally, the LLM refines the report, presenting the results to the user in natural language. Thanks to its modularity, our framework is flexible and extensible as new tools become available. Empirically, APEx reproduces the findings of existing studies while allowing for arbitrary analyses and hypothesis testing.
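The report-driven loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the APEx implementation: the names `Report`, `run_investigation`, and `stub_policy` are hypothetical, the "tools" are stand-in functions returning made-up scores, and a simple rule-based policy takes the place of the LLM that would choose the next experiment.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Running record of the investigation (hypothetical structure)."""
    question: str
    findings: list = field(default_factory=list)  # (experiment name, score) pairs


def run_investigation(question, tools, choose_next, max_steps=10):
    """Pick an experiment based on the report so far, run it, record the result.

    `choose_next` stands in for the LLM: it returns a tool name, or None
    once it judges the evidence sufficient to draw conclusions.
    """
    report = Report(question)
    for _ in range(max_steps):
        name = choose_next(report, tools)
        if name is None:
            break
        report.findings.append((name, tools[name]()))
    return report


def stub_policy(report, tools):
    """Assumed placeholder policy: run each tool once, then stop."""
    done = {name for name, _ in report.findings}
    remaining = [name for name in tools if name not in done]
    return remaining[0] if remaining else None


# Stand-in tools with invented scores, for illustration only.
tools = {
    "classify_clean_images": lambda: 0.91,
    "classify_rotated_images": lambda: 0.64,
}

report = run_investigation("Is the model robust to rotation?", tools, stub_policy)
for name, score in report.findings:
    print(f"{name}: {score:.2f}")
```

Keeping the decision step behind a single callable is one way to mirror the modularity the abstract mentions: tools can be added to the dictionary without touching the loop.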
New Amazon tool helps machine learning models identify unique objects – TechCrunch
Amazon announced a new capability today called Amazon Rekognition Custom Labels, which helps customers train machine learning models to recognize a set of objects when only a limited amount of training data is available. Typically, machine learning models need large data sets to learn something like what a picture of a dog looks like, as opposed to other animals. Amazon Rekognition Custom Labels can work with a limited data set to teach the algorithm a group of objects specific to a given use case. "Instead of having to train a model from scratch, which requires specialized machine learning expertise and millions of high-quality labeled images, customers can now use Amazon Rekognition Custom Labels to achieve state-of-the-art performance for their unique image analysis needs," the company wrote in a blog post announcing the new feature. For example, you may want to teach the model to identify a set of engine parts: a small collection of images that nonetheless carries a lot of meaning for a specific use case.
Accurate AI: Machine learning models identify findings in radiology reports
Machine learning models can identify key information in radiology reports with significant accuracy, according to a new study published in Radiology. The authors used more than 96,000 head CT reports for their research, turning to a bag-of-words (BOW) model to label a small subset of the reports and then allowing the trained algorithms to do the rest. BOW was used because it "discards grammar and context" and "utilizes document-level word occurrences as its features." Overall, machine learning algorithms were able to successfully identify findings in the reports. The best-performing model achieved a held-out area under the receiver operating characteristic curve (AUC) of 0.966 for identifying critical head CT findings and an average AUC of 0.957 across all head CT findings.
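To make the two technical ideas in this summary concrete, here is a small sketch of bag-of-words features and an AUC computation. It is not the study's pipeline: the four report snippets, the term lists, and the hand-set scoring rule are all invented for illustration, whereas the study trained models on labeled reports.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, discarding word order and grammar."""
    return Counter(text.lower().split())


def auc(labels, scores):
    """AUC as the chance that a positive example outscores a negative one.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy reports; 1 marks a critical finding.
reports = [
    "acute hemorrhage in the left frontal lobe",
    "no acute intracranial abnormality",
    "large hemorrhage with midline shift",
    "normal study no hemorrhage",
]
labels = [1, 0, 1, 0]

# Assumed hand-picked terms standing in for learned feature weights.
CRITICAL_TERMS = ("hemorrhage", "shift")
NEGATING_TERMS = ("no", "normal")


def score(text):
    """Score a report from its word counts alone."""
    bow = bag_of_words(text)
    return sum(bow[t] for t in CRITICAL_TERMS) - sum(bow[t] for t in NEGATING_TERMS)


scores = [score(r) for r in reports]
print(scores, auc(labels, scores))
```

An AUC of 0.966 therefore means that, given one report with a critical finding and one without, the model ranks the critical one higher about 96.6% of the time.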